Overview

Dataset statistics

Number of variables: 15
Number of observations: 891
Missing cells: 1043
Missing cells (%): 7.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 326.6 KiB
Average record size in memory: 375.3 B

Variable types

Numeric: 6
Categorical: 6
Text: 3

Alerts

Age is highly overall correlated with AgeBin (High correlation)
AgeBin is highly overall correlated with Age (High correlation)
FamilySize is highly overall correlated with Fare and 2 other fields (High correlation)
Fare is highly overall correlated with FamilySize and 1 other field (High correlation)
HasCabin is highly overall correlated with Fare and 1 other field (High correlation)
Parch is highly overall correlated with FamilySize (High correlation)
Pclass is highly overall correlated with HasCabin (High correlation)
Sex is highly overall correlated with Survived (High correlation)
SibSp is highly overall correlated with FamilySize (High correlation)
Survived is highly overall correlated with Sex (High correlation)
Age has 177 (19.9%) missing values (Missing)
Cabin has 687 (77.1%) missing values (Missing)
AgeBin has 177 (19.9%) missing values (Missing)
PassengerId is uniformly distributed (Uniform)
PassengerId has unique values (Unique)
Name has unique values (Unique)
SibSp has 608 (68.2%) zeros (Zeros)
Parch has 678 (76.1%) zeros (Zeros)
Fare has 15 (1.7%) zeros (Zeros)
FamilySize has 537 (60.3%) zeros (Zeros)
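
The overview figures can be cross-checked with a little arithmetic. The sketch below assumes that the missing-cell total is the sum of the per-column missing counts (Age, Cabin, and AgeBin from the alerts, plus the 2 missing Embarked values reported in that variable's section) and that 1 KiB is 1024 bytes. Both are assumptions about how the report derives these numbers, not documented behaviour.

```python
# Figures copied from this report.
n_variables = 15
n_observations = 891
missing_by_column = {"Age": 177, "Cabin": 687, "AgeBin": 177, "Embarked": 2}

# Assumption: "Missing cells" is the sum of per-column missing counts.
total_cells = n_variables * n_observations
missing_cells = sum(missing_by_column.values())
missing_pct = round(100 * missing_cells / total_cells, 1)

# Assumption: 1 KiB = 1024 bytes. The 326.6 KiB input is already rounded,
# so the per-record size only matches the reported value approximately.
total_bytes = 326.6 * 1024
avg_record_bytes = total_bytes / n_observations

print(f"missing cells: {missing_cells} of {total_cells} ({missing_pct}%)")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the totals agree with the Dataset statistics: 1043 missing cells out of 13365, or 7.8%.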

Reproduction

Analysis started: 2025-05-27 07:37:25.821517
Analysis finished: 2025-05-27 07:37:31.668268
Duration: 5.85 seconds
Software version: ydata-profiling v4.16.1

Variables

PassengerId
Real number (ℝ)

Uniform  Unique 

Distinct: 891
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 446
Minimum: 1
Maximum: 891
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 1
5th percentile: 45.5
Q1: 223.5
Median: 446
Q3: 668.5
95th percentile: 846.5
Maximum: 891
Range: 890
Interquartile range (IQR): 445

Descriptive statistics

Standard deviation: 257.35384
Coefficient of variation (CV): 0.57702655
Kurtosis: -1.2
Mean: 446
Median Absolute Deviation (MAD): 223
Skewness: 0
Sum: 397386
Variance: 66231
Monotonicity: Strictly increasing
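
These PassengerId figures are what one would expect if the column is simply the consecutive integers 1 to 891. That is an assumption, though it is consistent with 891 distinct values, a minimum of 1, a maximum of 891, and strict monotonicity. A minimal standard-library sketch under that assumption reproduces the reported values, which suggests (but does not prove) that the report uses the sample variance with an n - 1 denominator.

```python
import statistics

# Assumption: PassengerId is exactly 1, 2, ..., 891.
ids = list(range(1, 892))

total = sum(ids)
mean = statistics.mean(ids)
variance = statistics.variance(ids)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(ids)      # square root of the sample variance
cv = std_dev / mean                  # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
median = statistics.median(ids)
mad = statistics.median(abs(x - median) for x in ids)

print(total, mean, variance, round(std_dev, 5), round(cv, 8), mad)
```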
[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
1 | 1 | 0.1%
599 | 1 | 0.1%
588 | 1 | 0.1%
589 | 1 | 0.1%
590 | 1 | 0.1%
591 | 1 | 0.1%
592 | 1 | 0.1%
593 | 1 | 0.1%
594 | 1 | 0.1%
595 | 1 | 0.1%
Other values (881) | 881 | 98.9%

Minimum Values

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Maximum Values

Value | Count | Frequency (%)
891 | 1 | 0.1%
890 | 1 | 0.1%
889 | 1 | 0.1%
888 | 1 | 0.1%
887 | 1 | 0.1%
886 | 1 | 0.1%
885 | 1 | 0.1%
884 | 1 | 0.1%
883 | 1 | 0.1%
882 | 1 | 0.1%

Survived
Categorical

High correlation 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 50.6 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 891
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 549 | 61.6%
1 | 342 | 38.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
0 | 549 | 61.6%
1 | 342 | 38.4%

Most occurring characters

Value | Count | Frequency (%)
0 | 549 | 61.6%
1 | 342 | 38.4%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 891 | 100.0%

Pclass
Categorical

High correlation 

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 50.6 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 891
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 1
3rd row: 3
4th row: 1
5th row: 3

Common Values

Value | Count | Frequency (%)
3 | 491 | 55.1%
1 | 216 | 24.2%
2 | 184 | 20.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
3 | 491 | 55.1%
1 | 216 | 24.2%
2 | 184 | 20.7%

Most occurring characters

Value | Count | Frequency (%)
3 | 491 | 55.1%
1 | 216 | 24.2%
2 | 184 | 20.7%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 891 | 100.0%

Name
Text

Unique 

Distinct: 891
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 73.2 KiB

Length

Max length: 82
Median length: 52
Mean length: 26.965208
Min length: 12

Characters and Unicode

Total characters: 24026
Distinct characters: 60
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 891
Unique (%): 100.0%

Sample

1st row: Braund, Mr. Owen Harris
2nd row: Cumings, Mrs. John Bradley (Florence Briggs Thayer)
3rd row: Heikkinen, Miss. Laina
4th row: Futrelle, Mrs. Jacques Heath (Lily May Peel)
5th row: Allen, Mr. William Henry

Words

Value | Count | Frequency (%)
mr | 521 | 14.4%
miss | 182 | 5.0%
mrs | 129 | 3.6%
william | 64 | 1.8%
john | 44 | 1.2%
master | 40 | 1.1%
henry | 35 | 1.0%
george | 24 | 0.7%
james | 24 | 0.7%
charles | 23 | 0.6%
Other values (1515) | 2538 | 70.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 2735 | 11.4%
r | 1958 | 8.1%
e | 1703 | 7.1%
a | 1657 | 6.9%
i | 1325 | 5.5%
n | 1304 | 5.4%
s | 1297 | 5.4%
M | 1128 | 4.7%
l | 1067 | 4.4%
o | 1008 | 4.2%
Other values (50) | 8844 | 36.8%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 24026 | 100.0%

Sex
Categorical

High correlation 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 53.8 KiB

Length

Max length: 6
Median length: 4
Mean length: 4.704826
Min length: 4

Characters and Unicode

Total characters: 4192
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%
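
The length and character statistics for a categorical column follow directly from its category counts. As an illustration, the sketch below starts from the counts reported for Sex in this section (577 male, 314 female) and derives the total character count, the mean length, and the per-character frequencies. The variable names are illustrative only.

```python
from collections import Counter

# Category counts reported for Sex in this report.
counts = {"male": 577, "female": 314}

total_rows = sum(counts.values())
total_chars = sum(len(label) * n for label, n in counts.items())
mean_length = total_chars / total_rows

# Each character in a label occurs once per row carrying that label.
char_counts = Counter()
for label, n in counts.items():
    for ch in label:
        char_counts[ch] += n

print(total_chars, round(mean_length, 6))
print(char_counts.most_common())
```

The letter "e" appears once in "male" and twice in "female", which is why it is the most frequent character in this column.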

Sample

1st row: male
2nd row: female
3rd row: female
4th row: female
5th row: male

Common Values

Value | Count | Frequency (%)
male | 577 | 64.8%
female | 314 | 35.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
male | 577 | 64.8%
female | 314 | 35.2%

Most occurring characters

Value | Count | Frequency (%)
e | 1205 | 28.7%
m | 891 | 21.3%
a | 891 | 21.3%
l | 891 | 21.3%
f | 314 | 7.5%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 4192 | 100.0%

Age
Real number (ℝ)

High correlation  Missing 

Distinct: 88
Distinct (%): 12.3%
Missing: 177
Missing (%): 19.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.699118
Minimum: 0.42
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0.42
5th percentile: 4
Q1: 20.125
Median: 28
Q3: 38
95th percentile: 56
Maximum: 80
Range: 79.58
Interquartile range (IQR): 17.875

Descriptive statistics

Standard deviation: 14.526497
Coefficient of variation (CV): 0.48912219
Kurtosis: 0.17827415
Mean: 29.699118
Median Absolute Deviation (MAD): 9
Skewness: 0.38910778
Sum: 21205.17
Variance: 211.01912
Monotonicity: Not monotonic
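
Because Age has missing values, it helps to see which denominator the summary statistics use. The sketch below works only from figures printed in this section and assumes that the mean is computed over the non-missing rows (891 - 177 = 714). The reported numbers are consistent with that assumption.

```python
import math

# Figures copied from the Age section of this report.
n_observations = 891
n_missing = 177
total = 21205.17
q1, q3 = 20.125, 38
minimum, maximum = 0.42, 80
std_dev = 14.526497

# Assumption: statistics are computed over non-missing values only.
n_present = n_observations - n_missing
mean = total / n_present

iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum
cv = std_dev / mean            # coefficient of variation

print(n_present, round(mean, 6), iqr, round(value_range, 2), round(cv, 6))
```

Dividing the sum by all 891 rows instead would give a mean near 23.8, which does not match the report.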
[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
24 | 30 | 3.4%
22 | 27 | 3.0%
18 | 26 | 2.9%
28 | 25 | 2.8%
30 | 25 | 2.8%
19 | 25 | 2.8%
21 | 24 | 2.7%
25 | 23 | 2.6%
36 | 22 | 2.5%
29 | 20 | 2.2%
Other values (78) | 467 | 52.4%
(Missing) | 177 | 19.9%

Minimum Values

Value | Count | Frequency (%)
0.42 | 1 | 0.1%
0.67 | 1 | 0.1%
0.75 | 2 | 0.2%
0.83 | 2 | 0.2%
0.92 | 1 | 0.1%
1 | 7 | 0.8%
2 | 10 | 1.1%
3 | 6 | 0.7%
4 | 10 | 1.1%
5 | 4 | 0.4%

Maximum Values

Value | Count | Frequency (%)
80 | 1 | 0.1%
74 | 1 | 0.1%
71 | 2 | 0.2%
70.5 | 1 | 0.1%
70 | 2 | 0.2%
66 | 1 | 0.1%
65 | 3 | 0.3%
64 | 2 | 0.2%
63 | 2 | 0.2%
62 | 4 | 0.4%

SibSp
Real number (ℝ)

High correlation  Zeros 

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.52300786
Minimum: 0
Maximum: 8
Zeros: 608
Zeros (%): 68.2%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 1
95th percentile: 3
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.1027434
Coefficient of variation (CV): 2.1084644
Kurtosis: 17.88042
Mean: 0.52300786
Median Absolute Deviation (MAD): 0
Skewness: 3.6953517
Sum: 466
Variance: 1.2160431
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=7)]

Common Values

Value | Count | Frequency (%)
0 | 608 | 68.2%
1 | 209 | 23.5%
2 | 28 | 3.1%
4 | 18 | 2.0%
3 | 16 | 1.8%
8 | 7 | 0.8%
5 | 5 | 0.6%

Minimum Values

Value | Count | Frequency (%)
0 | 608 | 68.2%
1 | 209 | 23.5%
2 | 28 | 3.1%
3 | 16 | 1.8%
4 | 18 | 2.0%
5 | 5 | 0.6%
8 | 7 | 0.8%

Maximum Values

Value | Count | Frequency (%)
8 | 7 | 0.8%
5 | 5 | 0.6%
4 | 18 | 2.0%
3 | 16 | 1.8%
2 | 28 | 3.1%
1 | 209 | 23.5%
0 | 608 | 68.2%

Parch
Real number (ℝ)

High correlation  Zeros 

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.38159371
Minimum: 0
Maximum: 6
Zeros: 678
Zeros (%): 76.1%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 2
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.80605722
Coefficient of variation (CV): 2.1123441
Kurtosis: 9.7781252
Mean: 0.38159371
Median Absolute Deviation (MAD): 0
Skewness: 2.749117
Sum: 340
Variance: 0.64972824
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=7)]

Common Values

Value | Count | Frequency (%)
0 | 678 | 76.1%
1 | 118 | 13.2%
2 | 80 | 9.0%
5 | 5 | 0.6%
3 | 5 | 0.6%
4 | 4 | 0.4%
6 | 1 | 0.1%

Minimum Values

Value | Count | Frequency (%)
0 | 678 | 76.1%
1 | 118 | 13.2%
2 | 80 | 9.0%
3 | 5 | 0.6%
4 | 4 | 0.4%
5 | 5 | 0.6%
6 | 1 | 0.1%

Maximum Values

Value | Count | Frequency (%)
6 | 1 | 0.1%
5 | 5 | 0.6%
4 | 4 | 0.4%
3 | 5 | 0.6%
2 | 80 | 9.0%
1 | 118 | 13.2%
0 | 678 | 76.1%

Ticket
Text

Distinct: 681
Distinct (%): 76.4%
Missing: 0
Missing (%): 0.0%
Memory size: 55.6 KiB

Length

Max length: 18
Median length: 17
Mean length: 6.7508418
Min length: 3

Characters and Unicode

Total characters: 6015
Distinct characters: 35
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 547
Unique (%): 61.4%

Sample

1st row: A/5 21171
2nd row: PC 17599
3rd row: STON/O2. 3101282
4th row: 113803
5th row: 373450

Words

Value | Count | Frequency (%)
pc | 60 | 5.3%
c.a | 27 | 2.4%
a/5 | 17 | 1.5%
ca | 14 | 1.2%
ston/o | 12 | 1.1%
2 | 12 | 1.1%
sc/paris | 9 | 0.8%
w./c | 9 | 0.8%
soton/o.q | 8 | 0.7%
347082 | 7 | 0.6%
Other values (709) | 955 | 84.5%

Most occurring characters

Value | Count | Frequency (%)
3 | 746 | 12.4%
1 | 689 | 11.5%
2 | 594 | 9.9%
7 | 490 | 8.1%
4 | 464 | 7.7%
6 | 422 | 7.0%
0 | 406 | 6.7%
5 | 387 | 6.4%
9 | 328 | 5.5%
8 | 282 | 4.7%
Other values (25) | 1207 | 20.1%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 6015 | 100.0%

Fare
Real number (ℝ)

High correlation  Zeros 

Distinct: 248
Distinct (%): 27.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.204208
Minimum: 0
Maximum: 512.3292
Zeros: 15
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 7.225
Q1: 7.9104
Median: 14.4542
Q3: 31
95th percentile: 112.07915
Maximum: 512.3292
Range: 512.3292
Interquartile range (IQR): 23.0896

Descriptive statistics

Standard deviation: 49.693429
Coefficient of variation (CV): 1.5430725
Kurtosis: 33.398141
Mean: 32.204208
Median Absolute Deviation (MAD): 6.9042
Skewness: 4.7873165
Sum: 28693.949
Variance: 2469.4368
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
8.05 | 43 | 4.8%
13 | 42 | 4.7%
7.8958 | 38 | 4.3%
7.75 | 34 | 3.8%
26 | 31 | 3.5%
10.5 | 24 | 2.7%
7.925 | 18 | 2.0%
7.775 | 16 | 1.8%
7.2292 | 15 | 1.7%
0 | 15 | 1.7%
Other values (238) | 615 | 69.0%

Minimum Values

Value | Count | Frequency (%)
0 | 15 | 1.7%
4.0125 | 1 | 0.1%
5 | 1 | 0.1%
6.2375 | 1 | 0.1%
6.4375 | 1 | 0.1%
6.45 | 1 | 0.1%
6.4958 | 2 | 0.2%
6.75 | 2 | 0.2%
6.8583 | 1 | 0.1%
6.95 | 1 | 0.1%

Maximum Values

Value | Count | Frequency (%)
512.3292 | 3 | 0.3%
263 | 4 | 0.4%
262.375 | 2 | 0.2%
247.5208 | 2 | 0.2%
227.525 | 4 | 0.4%
221.7792 | 1 | 0.1%
211.5 | 1 | 0.1%
211.3375 | 3 | 0.3%
164.8667 | 2 | 0.2%
153.4625 | 3 | 0.3%

Cabin
Text

Missing 

Distinct: 147
Distinct (%): 72.1%
Missing: 687
Missing (%): 77.1%
Memory size: 33.7 KiB

Length

Max length: 15
Median length: 3
Mean length: 3.5882353
Min length: 1

Characters and Unicode

Total characters: 732
Distinct characters: 19
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 101
Unique (%): 49.5%

Sample

1st row: C85
2nd row: C123
3rd row: E46
4th row: G6
5th row: C103

Words

Value | Count | Frequency (%)
c23 | 4 | 1.7%
c27 | 4 | 1.7%
g6 | 4 | 1.7%
b96 | 4 | 1.7%
b98 | 4 | 1.7%
f | 4 | 1.7%
c25 | 4 | 1.7%
f33 | 3 | 1.3%
e101 | 3 | 1.3%
f2 | 3 | 1.3%
Other values (151) | 201 | 84.5%

Most occurring characters

Value | Count | Frequency (%)
2 | 72 | 9.8%
C | 71 | 9.7%
B | 64 | 8.7%
1 | 61 | 8.3%
3 | 59 | 8.1%
6 | 51 | 7.0%
5 | 45 | 6.1%
4 | 37 | 5.1%
8 | 37 | 5.1%
(space) | 34 | 4.6%
Other values (9) | 201 | 27.5%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 732 | 100.0%

Embarked
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 2
Missing (%): 0.2%
Memory size: 50.6 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 889
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: S
2nd row: C
3rd row: S
4th row: S
5th row: S

Common Values

Value | Count | Frequency (%)
S | 644 | 72.3%
C | 168 | 18.9%
Q | 77 | 8.6%
(Missing) | 2 | 0.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
s | 644 | 72.4%
c | 168 | 18.9%
q | 77 | 8.7%

Most occurring characters

Value | Count | Frequency (%)
S | 644 | 72.4%
C | 168 | 18.9%
Q | 77 | 8.7%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 889 | 100.0%

FamilySize
Real number (ℝ)

High correlation  Zeros 

Distinct: 9
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.90460157
Minimum: 0
Maximum: 10
Zeros: 537
Zeros (%): 60.3%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 1
95th percentile: 5
Maximum: 10
Range: 10
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.6134585
Coefficient of variation (CV): 1.7836124
Kurtosis: 9.159666
Mean: 0.90460157
Median Absolute Deviation (MAD): 0
Skewness: 2.7274415
Sum: 806
Variance: 2.6032485
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=9)]

Common Values

Value | Count | Frequency (%)
0 | 537 | 60.3%
1 | 161 | 18.1%
2 | 102 | 11.4%
3 | 29 | 3.3%
5 | 22 | 2.5%
4 | 15 | 1.7%
6 | 12 | 1.3%
10 | 7 | 0.8%
7 | 6 | 0.7%

Minimum Values

Value | Count | Frequency (%)
0 | 537 | 60.3%
1 | 161 | 18.1%
2 | 102 | 11.4%
3 | 29 | 3.3%
4 | 15 | 1.7%
5 | 22 | 2.5%
6 | 12 | 1.3%
7 | 6 | 0.7%
10 | 7 | 0.8%

Maximum Values

Value | Count | Frequency (%)
10 | 7 | 0.8%
7 | 6 | 0.7%
6 | 12 | 1.3%
5 | 22 | 2.5%
4 | 15 | 1.7%
3 | 29 | 3.3%
2 | 102 | 11.4%
1 | 161 | 18.1%
0 | 537 | 60.3%

HasCabin
Categorical

High correlation 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 50.6 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 891
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 687 | 77.1%
1 | 204 | 22.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
0 | 687 | 77.1%
1 | 204 | 22.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 687 | 77.1%
1 | 204 | 22.9%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 891 | 100.0%

AgeBin
Categorical

High correlation  Missing 

Distinct: 5
Distinct (%): 0.7%
Missing: 177
Missing (%): 19.9%
Memory size: 1.2 KiB

Length

Max length: 8
Median length: 8
Mean length: 7.9103641
Min length: 7

Characters and Unicode

Total characters: 5648
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (18, 40]
2nd row: (18, 40]
3rd row: (18, 40]
4th row: (18, 40]
5th row: (18, 40]

Common Values

Value | Count | Frequency (%)
(18, 40] | 425 | 47.7%
(40, 60] | 128 | 14.4%
(10, 18] | 75 | 8.4%
(0, 10] | 64 | 7.2%
(60, 80] | 22 | 2.5%
(Missing) | 177 | 19.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
40 | 553 | 38.7%
18 | 500 | 35.0%
60 | 150 | 10.5%
10 | 139 | 9.7%
0 | 64 | 4.5%
80 | 22 | 1.5%

Most occurring characters

Value | Count | Frequency (%)
0 | 928 | 16.4%
( | 714 | 12.6%
, | 714 | 12.6%
(space) | 714 | 12.6%
] | 714 | 12.6%
1 | 639 | 11.3%
4 | 553 | 9.8%
8 | 522 | 9.2%
6 | 150 | 2.7%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 5648 | 100.0%

Interactions

[Figures: 36 pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

Variable | Age | AgeBin | Embarked | FamilySize | Fare | HasCabin | Parch | PassengerId | Pclass | Sex | SibSp | Survived
Age | 1.000 | 0.858 | 0.065 | -0.228 | 0.135 | 0.258 | -0.254 | 0.041 | 0.269 | 0.099 | -0.182 | 0.155
AgeBin | 0.858 | 1.000 | 0.029 | 0.257 | 0.075 | 0.208 | 0.275 | 0.000 | 0.228 | 0.126 | 0.252 | 0.113
Embarked | 0.065 | 0.029 | 1.000 | 0.083 | 0.196 | 0.228 | 0.052 | 0.000 | 0.260 | 0.113 | 0.092 | 0.166
FamilySize | -0.228 | 0.257 | 0.083 | 1.000 | 0.529 | 0.070 | 0.801 | -0.050 | 0.137 | 0.205 | 0.849 | 0.215
Fare | 0.135 | 0.075 | 0.196 | 0.529 | 1.000 | 0.582 | 0.410 | -0.014 | 0.479 | 0.189 | 0.447 | 0.283
HasCabin | 0.258 | 0.208 | 0.228 | 0.070 | 0.582 | 1.000 | 0.091 | 0.063 | 0.790 | 0.134 | 0.138 | 0.313
Parch | -0.254 | 0.275 | 0.052 | 0.801 | 0.410 | 0.091 | 1.000 | 0.001 | 0.022 | 0.247 | 0.450 | 0.157
PassengerId | 0.041 | 0.000 | 0.000 | -0.050 | -0.014 | 0.063 | 0.001 | 1.000 | 0.032 | 0.066 | -0.061 | 0.104
Pclass | 0.269 | 0.228 | 0.260 | 0.137 | 0.479 | 0.790 | 0.022 | 0.032 | 1.000 | 0.130 | 0.148 | 0.337
Sex | 0.099 | 0.126 | 0.113 | 0.205 | 0.189 | 0.134 | 0.247 | 0.066 | 0.130 | 1.000 | 0.206 | 0.540
SibSp | -0.182 | 0.252 | 0.092 | 0.849 | 0.447 | 0.138 | 0.450 | -0.061 | 0.148 | 0.206 | 1.000 | 0.187
Survived | 0.155 | 0.113 | 0.166 | 0.215 | 0.283 | 0.313 | 0.157 | 0.104 | 0.337 | 0.540 | 0.187 | 1.000
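
The "High correlation" alerts in the Overview can be recovered from this matrix by keeping every off-diagonal pair whose absolute coefficient reaches a cut-off. The sketch below uses a cut-off of 0.5, which is an assumption chosen because it reproduces the alerts listed in this report. It is not taken from the report's configuration. Only the upper triangle of each row is read, since the matrix is symmetric.

```python
columns = ["Age", "AgeBin", "Embarked", "FamilySize", "Fare", "HasCabin",
           "Parch", "PassengerId", "Pclass", "Sex", "SibSp", "Survived"]

# Rows copied from the correlation matrix in this report.
matrix = {
    "Age":         [1.000, 0.858, 0.065, -0.228, 0.135, 0.258, -0.254, 0.041, 0.269, 0.099, -0.182, 0.155],
    "AgeBin":      [0.858, 1.000, 0.029, 0.257, 0.075, 0.208, 0.275, 0.000, 0.228, 0.126, 0.252, 0.113],
    "Embarked":    [0.065, 0.029, 1.000, 0.083, 0.196, 0.228, 0.052, 0.000, 0.260, 0.113, 0.092, 0.166],
    "FamilySize":  [-0.228, 0.257, 0.083, 1.000, 0.529, 0.070, 0.801, -0.050, 0.137, 0.205, 0.849, 0.215],
    "Fare":        [0.135, 0.075, 0.196, 0.529, 1.000, 0.582, 0.410, -0.014, 0.479, 0.189, 0.447, 0.283],
    "HasCabin":    [0.258, 0.208, 0.228, 0.070, 0.582, 1.000, 0.091, 0.063, 0.790, 0.134, 0.138, 0.313],
    "Parch":       [-0.254, 0.275, 0.052, 0.801, 0.410, 0.091, 1.000, 0.001, 0.022, 0.247, 0.450, 0.157],
    "PassengerId": [0.041, 0.000, 0.000, -0.050, -0.014, 0.063, 0.001, 1.000, 0.032, 0.066, -0.061, 0.104],
    "Pclass":      [0.269, 0.228, 0.260, 0.137, 0.479, 0.790, 0.022, 0.032, 1.000, 0.130, 0.148, 0.337],
    "Sex":         [0.099, 0.126, 0.113, 0.205, 0.189, 0.134, 0.247, 0.066, 0.130, 1.000, 0.206, 0.540],
    "SibSp":       [-0.182, 0.252, 0.092, 0.849, 0.447, 0.138, 0.450, -0.061, 0.148, 0.206, 1.000, 0.187],
    "Survived":    [0.155, 0.113, 0.166, 0.215, 0.283, 0.313, 0.157, 0.104, 0.337, 0.540, 0.187, 1.000],
}

THRESHOLD = 0.5  # assumption: chosen to match the alerts in this report

# Walk the upper triangle so each pair is reported once.
pairs = []
for i, left in enumerate(columns):
    for j in range(i + 1, len(columns)):
        r = matrix[left][j]
        if abs(r) >= THRESHOLD:
            pairs.append((left, columns[j], r))

for left, right, r in pairs:
    print(f"{left} ~ {right}: {r:.3f}")
```

With this cut-off the sketch yields seven pairs, and every variable flagged in the alerts appears in at least one of them.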

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

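First rows

Index | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | FamilySize | HasCabin | AgeBin
0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S | 1 | 0 | (18.0, 40.0]
1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C | 1 | 1 | (18.0, 40.0]
2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.9250 | NaN | S | 0 | 0 | (18.0, 40.0]
3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1000 | C123 | S | 1 | 1 | (18.0, 40.0]
4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.0500 | NaN | S | 0 | 0 | (18.0, 40.0]
5 | 6 | 0 | 3 | Moran, Mr. James | male | NaN | 0 | 0 | 330877 | 8.4583 | NaN | Q | 0 | 0 | NaN
6 | 7 | 0 | 1 | McCarthy, Mr. Timothy J | male | 54.0 | 0 | 0 | 17463 | 51.8625 | E46 | S | 0 | 1 | (40.0, 60.0]
7 | 8 | 0 | 3 | Palsson, Master. Gosta Leonard | male | 2.0 | 3 | 1 | 349909 | 21.0750 | NaN | S | 4 | 0 | (0.0, 10.0]
8 | 9 | 1 | 3 | Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) | female | 27.0 | 0 | 2 | 347742 | 11.1333 | NaN | S | 2 | 0 | (18.0, 40.0]
9 | 10 | 1 | 2 | Nasser, Mrs. Nicholas (Adele Achem) | female | 14.0 | 1 | 0 | 237736 | 30.0708 | NaN | C | 1 | 0 | (10.0, 18.0]

Last rows

Index | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | FamilySize | HasCabin | AgeBin
881 | 882 | 0 | 3 | Markun, Mr. Johann | male | 33.0 | 0 | 0 | 349257 | 7.8958 | NaN | S | 0 | 0 | (18.0, 40.0]
882 | 883 | 0 | 3 | Dahlberg, Miss. Gerda Ulrika | female | 22.0 | 0 | 0 | 7552 | 10.5167 | NaN | S | 0 | 0 | (18.0, 40.0]
883 | 884 | 0 | 2 | Banfield, Mr. Frederick James | male | 28.0 | 0 | 0 | C.A./SOTON 34068 | 10.5000 | NaN | S | 0 | 0 | (18.0, 40.0]
884 | 885 | 0 | 3 | Sutehall, Mr. Henry Jr | male | 25.0 | 0 | 0 | SOTON/OQ 392076 | 7.0500 | NaN | S | 0 | 0 | (18.0, 40.0]
885 | 886 | 0 | 3 | Rice, Mrs. William (Margaret Norton) | female | 39.0 | 0 | 5 | 382652 | 29.1250 | NaN | Q | 5 | 0 | (18.0, 40.0]
886 | 887 | 0 | 2 | Montvila, Rev. Juozas | male | 27.0 | 0 | 0 | 211536 | 13.0000 | NaN | S | 0 | 0 | (18.0, 40.0]
887 | 888 | 1 | 1 | Graham, Miss. Margaret Edith | female | 19.0 | 0 | 0 | 112053 | 30.0000 | B42 | S | 0 | 1 | (18.0, 40.0]
888 | 889 | 0 | 3 | Johnston, Miss. Catherine Helen "Carrie" | female | NaN | 1 | 2 | W./C. 6607 | 23.4500 | NaN | S | 3 | 0 | NaN
889 | 890 | 1 | 1 | Behr, Mr. Karl Howell | male | 26.0 | 0 | 0 | 111369 | 30.0000 | C148 | C | 0 | 1 | (18.0, 40.0]
890 | 891 | 0 | 3 | Dooley, Mr. Patrick | male | 32.0 | 0 | 0 | 370376 | 7.7500 | NaN | Q | 0 | 0 | (18.0, 40.0]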
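
The three engineered columns (FamilySize, HasCabin, AgeBin) are not defined anywhere in this report, but the sample rows suggest simple rules. The sketch below states those rules as assumptions inferred from the data: FamilySize as SibSp + Parch (consistent with the reported sums, 466 + 340 = 806), HasCabin as a 0/1 flag for a non-missing Cabin, and AgeBin as right-closed intervals with edges 0, 10, 18, 40, 60, 80. The function names are hypothetical, and the original pipeline may well have used a library binning helper instead of `bisect`.

```python
from bisect import bisect_left

# Assumed bin edges, read off the AgeBin labels in this report.
AGE_EDGES = [0, 10, 18, 40, 60, 80]

def family_size(sibsp, parch):
    """Assumed rule: siblings/spouses plus parents/children."""
    return sibsp + parch

def has_cabin(cabin):
    """Assumed rule: 1 when a cabin value is present, else 0."""
    return 0 if cabin is None else 1

def age_bin(age):
    """Return the right-closed interval (low, high] containing age, or None."""
    if age is None:
        return None
    i = bisect_left(AGE_EDGES, age)
    if i == 0 or i == len(AGE_EDGES):
        return None  # outside (0, 80]
    return (AGE_EDGES[i - 1], AGE_EDGES[i])

# (Age, SibSp, Parch, Cabin) taken from four of the sample rows.
rows = {
    "Braund":   (22.0, 1, 0, None),
    "Moran":    (None, 0, 0, None),
    "McCarthy": (54.0, 0, 0, "E46"),
    "Palsson":  (2.0, 3, 1, None),
}

for name, (age, sibsp, parch, cabin) in rows.items():
    print(name, family_size(sibsp, parch), has_cabin(cabin), age_bin(age))
```

For these four passengers the derived values agree with the FamilySize, HasCabin, and AgeBin columns shown in the sample.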